Forecasting methodology with structural auto-adaptive intelligent grey models

Accurate mid- and long-term petroleum products (PP) consumption forecasting is vital for strategic reserve management and energy planning. To address this forecasting problem, a novel structural auto-adaptive intelligent grey model (SAIGM) is developed in this paper. First, a novel time response function is established that corrects the main weaknesses of the traditional grey model. Then, the optimal parameter values are calculated using SAIGM to increase adaptability and flexibility across a variety of forecasting problems. The viability and performance of SAIGM are examined with both ideal and real-world data. The former are constructed from algebraic series, while the latter consist of Cameroon's PP consumption data. With its built-in structural flexibility, SAIGM yields forecasts with an RMSE of 3.10 and a MAPE of 1.54%. The proposed model performs better than competing intelligent grey systems that have been developed to date and is thus a valid forecasting tool for tracking the growth of Cameroon's PP demand.
• SAIGM enhances the forecasting power of intelligent grey models by fully extracting the laws of a system, regardless of the data specifications.
• SAIGM is extended to include quasi-exponential series by addressing structural flexibility and parametrization concerns.
• The proposed model requires neither input attribute determination nor data preprocessing.


Interest of SAIGM
The conventional grey model (GM) gives good predictions for series that follow a homogeneous exponential law [1], but not for data with anomalous properties such as volatility, periodicity and sinusoidal trends [2]. Numerous works have proposed enhanced GMs, including trigonometric (sin t-GM), discrete (DGM) and seasonal (SGM) models [3, 4]. Unfortunately, the rigid structure of such systems makes it challenging to capture non-linear variations in some series. It is therefore crucial to choose a structure that suits the actual properties of the data. Another problem that requires attention is the GM's time response function.
As a result, this study develops a structural auto-adaptive intelligent grey model (SAIGM) that automatically adjusts to data features and fully captures a system's evolution, whatever its heterogeneity, homogeneity, volatility, linearity or periodicity. The parameters of SAIGM and its time response function are derived in this paper from differential equations. This approach relieves SAIGM of parametrization concerns and produces quick simulations and highly accurate forecasts.

Contributions and novelty
Qian and Sui [5] as well as Sapnken and Tamba [6] demonstrate that GMs yield precise energy forecasts; however, they remain flawed when the input data follow trends other than exponential growth. With a seasonality index, SGM(1,1) can only represent simple periodic variations. Discrete models can capture various types of variation, whereas SGM cannot. Nonetheless, discrete GMs have a jump flaw in their time response function that results in information loss [5]. Finally, the design of data pretreatment models makes it difficult for them to adjust flexibly to the nonlinearities ingrained in time series, which lowers precision at each forecasting period. This work develops a structural auto-adaptive intelligent GM (SAIGM(1,1)) to address these limitations. It offers three contributions:
• SAIGM(1,1) is developed to enhance the prediction capabilities of intelligent GMs by enabling them to fully explore a system's laws of evolution, whatever the properties of the time series used.
• SAIGM(1,1) applies to random, linear, and quasi-exponential series in addition to pure exponential series, because it addresses the structural flexibility and parametrization concerns.
• SAIGM(1,1) requires neither input attribute determination nor data preprocessing.
As a result, SAIGM lessens the reliance on modelling expertise from the standpoint of expert systems.
The next section (Section 2) of this paper outlines the methodology's general principles; Section 3 presents the simulation performance produced using SAIGM(1,1); and Section 4 concludes the paper.

Method
To predict the evolution of a system, sufficient information about it must be collected beforehand, because future processes are very often linked to previous observations [7]. However, we occasionally come across a system for which we do not have enough information. Fortunately, several tools of varying complexity can remedy this situation. These include linear regression [7], extrapolation, genetic algorithms [8], grey models (GM) [9], and artificial intelligence [10]. GM stands out because it can generate very accurate forecasts with only four observations [11]. This is a clear advantage for the many research fields and situations where data are scarce.
Unfortunately, univariate first-order grey models (GM(1,1)) only yield accurate forecasts if the data follow a homogeneous exponential pattern [12]. In practice (for instance in energy consumption forecasting [13], accident prevention [14] or crop forecasting [15]), such situations almost never occur. It is therefore necessary to establish a GM(1,1) that works with all types of data and produces very accurate forecasts. In general, the notation GM(p,q) designates a grey model built on an ordinary differential equation (called grey) of order p with q variables. The set of variables is denoted $X^{(0)}$, with components denoted $x^{(0)}_k$. The superscript (0) indicates that the variable is raw (meaning it has not yet undergone any transformation). Once $X^{(0)}$ undergoes a transformation (an accumulation of some of its components, for example), it is denoted $X^{(1)}$ and the transformed components are denoted $x^{(1)}_1, x^{(1)}_2, \dots, x^{(1)}_n$. In the following sections, we start by exposing the failures of GM(1,1). Then, we propose a new model and its properties, before validating it with theoretical and practical cases.

Flaws of the standard GM(1,1) and its extension
We start by recalling the shortcomings of GM(1,1), in order to explain why it underperforms and why it is not generalizable. From there, we describe how to develop an intelligent auto-adaptive GM that automatically modifies its settings to lower forecasting errors, whatever the type of series employed. The modelling approach, the model's features, and the parameter calculation are also described.
Definition 1. [16]: $X^{(0)}$ is the system's raw input sequence, defined by $X^{(0)} = (x^{(0)}_1, x^{(0)}_2, \dots, x^{(0)}_n)$, with $x^{(0)}_k \geq 0$ for all $k = 1, 2, \dots, n$. The first-order accumulated generating operation (1-AGO) sequence is $X^{(1)} = (x^{(1)}_1, x^{(1)}_2, \dots, x^{(1)}_n)$, calculated with Eq. (1):
$$x^{(1)}_k = \sum_{i=1}^{k} x^{(0)}_i, \quad k = 1, 2, \dots, n \qquad (1)$$
The 1-AGO is essential for extracting the system's evolution rule, since it removes potentially disruptive oscillations from the system [16].
Definition 2. [16]: Assume that the definitions of $X^{(0)}$ and $X^{(1)}$ remain the same. A new sequence $Z^{(1)} = (z^{(1)}_2, z^{(1)}_3, \dots, z^{(1)}_n)$, called the mean sequence generated from consecutive terms (or background values), is calculated as in Eq. (2):
$$z^{(1)}_k = \alpha\, x^{(1)}_k + (1 - \alpha)\, x^{(1)}_{k-1}, \quad k = 2, 3, \dots, n \qquad (2)$$
where $\alpha = \frac{1}{2}$, so that Eq. (2) can be rewritten as Eq. (3):
$$z^{(1)}_k = \frac{1}{2}\left(x^{(1)}_k + x^{(1)}_{k-1}\right), \quad k = 2, 3, \dots, n \qquad (3)$$
Definition 3. [16]: According to Definitions 1 and 2, Eqs. (4) and (5) below are referred to as the grey differential equation and the image equation of the traditional GM(1,1), respectively:
$$x^{(0)}_k + a\, z^{(1)}_k = b \qquad (4)$$
$$\frac{dx^{(1)}}{dt} + a\, x^{(1)} = b \qquad (5)$$
Eq. (4) is the basic form of GM(1,1).
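$(-a)$ represents the development coefficient of $\hat{x}^{(1)}$ and $\hat{x}^{(0)}$, whereas $b$ represents the grey action quantity. In general, the input variables of a grey system are external to it or must be predefined. Since GM(1,1) is implemented with one type of sequence at a time, it uses the sequence $X^{(1)}$ for this purpose, disregarding any external sequences (called driving values). The parameter $b$ is derived from $X^{(1)}$; it translates the variations seen in the data, and its exact meaning is grey. This parameter embodies the extension of the corresponding intension. Note that this feature distinguishes grey systems from black boxes.
Theorem 1. [16]: Using Definitions 1 and 2, the vector $\hat{a} = (a, b)^T$ can be calculated by ordinary least squares [17] (Eq. (6)):
$$\hat{a} = (B^T B)^{-1} B^T Y, \quad B = \begin{pmatrix} -z^{(1)}_2 & 1 \\ -z^{(1)}_3 & 1 \\ \vdots & \vdots \\ -z^{(1)}_n & 1 \end{pmatrix}, \quad Y = \begin{pmatrix} x^{(0)}_2 \\ x^{(0)}_3 \\ \vdots \\ x^{(0)}_n \end{pmatrix} \qquad (6)$$
Theorem 2. [16]: Suppose $\hat{a}$, $B$ and $Y$ are the same as in Theorem 1; then Eqs. (7), (8) and (9) are called the time response function, the time response sequence, and the restored values, respectively:
$$\hat{x}^{(1)}(t) = \left(x^{(0)}_1 - \frac{b}{a}\right) e^{-a(t-1)} + \frac{b}{a} \qquad (7)$$
$$\hat{x}^{(1)}_k = \left(x^{(0)}_1 - \frac{b}{a}\right) e^{-a(k-1)} + \frac{b}{a}, \quad k = 1, 2, \dots, n \qquad (8)$$
$$\hat{x}^{(0)}_k = \hat{x}^{(1)}_k - \hat{x}^{(1)}_{k-1}, \quad k = 2, 3, \dots, n \qquad (9)$$
$x^{(0)}_1$ (equal to $x^{(1)}_1$) is the initial condition for the solution of the image equation.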
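The classical GM(1,1) procedure recalled in Definitions 1-3 and Theorems 1-2 can be sketched in Python as follows. This is a minimal illustration and not the authors' code: it assumes the textbook form of the model (1-AGO by cumulative sum, background values with weight 1/2, least-squares estimate of (a, b), exponential time response, restoration by first differences). The function names `gm11_fit` and `gm11_predict` and the demo series are hypothetical.

```python
import numpy as np

def gm11_fit(x0):
    """Estimate (a, b) of the classical GM(1,1) by ordinary least squares."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)                        # 1-AGO sequence
    z1 = 0.5 * (x1[1:] + x1[:-1])             # background values, weight 1/2
    B = np.column_stack((-z1, np.ones_like(z1)))
    Y = x0[1:]
    # Least-squares solution of B @ (a, b) = Y
    (a, b), *_ = np.linalg.lstsq(B, Y, rcond=None)
    return float(a), float(b)

def gm11_predict(x0, a, b, horizon=0):
    """Restored values for the fitted points plus `horizon` future points.

    Assumes a != 0 (otherwise b / a is undefined).
    """
    x0 = np.asarray(x0, dtype=float)
    k = np.arange(1, len(x0) + horizon + 1)
    # Time response sequence with initial condition x0[0]
    x1_hat = (x0[0] - b / a) * np.exp(-a * (k - 1)) + b / a
    x0_hat = np.empty_like(x1_hat)
    x0_hat[0] = x0[0]
    x0_hat[1:] = np.diff(x1_hat)              # inverse accumulation
    return x0_hat

# Illustrative (made-up) homogeneous exponential series, not data from the paper
series = 2.0 * 1.1 ** np.arange(8)
a_hat, b_hat = gm11_fit(series)
forecast = gm11_predict(series, a_hat, b_hat, horizon=2)
print("a =", a_hat, "b =", b_hat)
print("restored and forecast values:", forecast)
```

Even on this homogeneous exponential series the fit is close but not exact, because the parameters are estimated from the discrete equation while the predictions come from the continuous time response function.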
The preceding analysis may lead us to believe that the standard GM(1,1) performs well only with homogeneous exponential series, a restriction that stems from the fixed structure and parametrization of GM. In addition, the second analysis concludes that it can still produce unreliable results even when the series obeys the homogeneous exponential law. Therefore, in both instances, the standard GM(1,1) model cannot adequately extract the law of evolution of a generalized (real-world) system.

SAIGM(1,1) parameterization
In this paper, the three intermediate parameters (indexed 1, 2 and 3) are calculated using ordinary least squares and linear algebraic formulas, and the model parameters are then deduced from them. During simulation, the discrepancy Δ between the actual values $x^{(1)}$ and the predicted values $\hat{x}^{(1)}$ must be kept to a minimum, where Δ is expressed through Eq. (13). Ordinary least squares is used to minimize Δ with respect to the three intermediate parameters, which yields the system (S). Rearranging the terms of (S) gives Eq. (14). The system (S) can be rewritten as MΩ = Ψ. This system of equations is linear with a unique solution. Hence, Cramer's rule can be applied to calculate the three intermediate parameters, from which the model parameters are deduced [19]. Each intermediate parameter is a ratio of two determinants: the numerator is the determinant associated with that parameter and the denominator is the determinant of M. The predicted values $\hat{x}^{(1)}$ (Eq. (15)) are calculated by substituting the model parameters into Eq. (13).
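The Cramer's rule step for a linear system of the form MΩ = Ψ can be sketched as follows. This is only an illustration of the generic technique: the helper name `cramer_solve` is hypothetical, and the numerical entries of `M` and `Psi` below are made-up placeholders, not the least-squares sums that define the system (S) in this paper.

```python
import numpy as np

def cramer_solve(M, Psi):
    """Solve M @ Omega = Psi with Cramer's rule (small square systems only)."""
    M = np.asarray(M, dtype=float)
    Psi = np.asarray(Psi, dtype=float)
    D = np.linalg.det(M)                      # determinant of the coefficient matrix
    if np.isclose(D, 0.0):
        raise ValueError("M is singular: no unique solution exists")
    Omega = np.empty(len(Psi))
    for j in range(len(Psi)):
        Mj = M.copy()
        Mj[:, j] = Psi                        # replace column j by the right-hand side
        Omega[j] = np.linalg.det(Mj) / D      # j-th unknown as a ratio of determinants
    return Omega

# Placeholder 3x3 system (assumed values for illustration only)
M = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]
Psi = [4.0, 10.0, 14.0]
print(cramer_solve(M, Psi))
```

For a 3x3 system this is equivalent to calling `np.linalg.solve(M, Psi)`; Cramer's rule is shown here only because it is the method named in the text.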

Comparing the success rates of predictions with ideal data
To assess SAIGM's performance, we compare it with SADGM [5] and TDPDGM [20], two recent intelligent GMs, on their capacity to predict homogeneous exponential series (series 1), heterogeneous exponential series (series 2), quasi-heterogeneous exponential series (series 3), random series (series 4), and linear series (series 5), as in [21]. SAIGM's flexible structure helps it account for the quirks of data from a generalized system. Competing intelligent GMs can only be applied to systems with one characteristic at a time, since their structural flexibility is insufficient. This explains why TDPDGM deviates from the actual data in Figs. 3-6. In reality, however, real-world systems never show just one trait, but rather a combination of them. Competing intelligent GMs like SADGM cannot be precise for such systems. We therefore check the performance of the new model with data from a real-world system.

Comparing the success rates of predictions with real-world data
Cameroon's petroleum products (PP) consumption data for the years 1996-2012 are used to implement TDPDGM, SADGM, and SAIGM and to simulate predictions for the years 2013-2017. Table 2 provides the statistical forecast errors of these intelligent GMs, while Fig. 7 displays the fitted prediction curves. Fig. 7 shows that SADGM's forecasts lie clearly below the system's actual evolution trend. The forecasting curves demonstrate that SAIGM performs noticeably better than TDPDGM and SADGM. More significantly, although SADGM requires a brief training phase prior to modelling, SAIGM and SADGM both capture the system's evolution and trend better than TDPDGM, whose forecasts oscillate on either side of the real data without settling. The absolute percentage errors (APEs) shown in Fig. 8 and the performance metrics (given by Eqs. (28)-(30)) of SAIGM are significantly lower than those of TDPDGM and SADGM. These error statistics attest to SAIGM's forecasting ability and its capacity to extract the evolution law of a real-world system. It should be noted, however, that the mathematical formulation of TDPDGM indicates that it is not appropriate for all types of series. Moreover, TDPDGM is a discrete model and its time response function (given by Eq. (31)) combines an exponential function, a discrete integral, an exponential function and the polynomial function (represented by Eq. (32)) (see Ma and Liu [20], p. 19).
Therefore, the TDPDGM model cannot yield precise forecasts when implemented with series that do not approximately fit Eqs. (31) and (32). Unlike its counterparts, TDPDGM has a mathematical formulation that is not flexible when the input data deviate from those for which it was designed. This indicates that a preliminary study should be carried out to determine the nature of the input data before choosing the most appropriate model, except for the new SAIGM that this paper proposes.
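The error statistics named in this comparison (APE, MAPE and RMSE) can be computed as sketched below. The sketch assumes the usual textbook definitions of these metrics, which may differ in detail from Eqs. (28)-(30); the function names and the sample numbers are hypothetical and are not the Cameroon data or the results of Table 2.

```python
import numpy as np

def ape(actual, forecast):
    """Absolute percentage error of each point, in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.abs((actual - forecast) / actual)

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return float(np.mean(ape(actual, forecast)))

def rmse(actual, forecast):
    """Root mean square error, in the units of the data."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

# Made-up values for illustration only
actual = [100.0, 200.0]
forecast = [110.0, 190.0]
print(ape(actual, forecast), mape(actual, forecast), rmse(actual, forecast))
```

MAPE is scale-free and therefore convenient for comparing models across series, whereas RMSE penalizes large individual errors more heavily.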

Conclusion
A structural auto-adaptive intelligent grey forecasting model (SAIGM) that can be used with both ideal and generalized series is developed in this paper. It is verified experimentally by forecasting PP demand for Cameroon. Compared with earlier intelligent models published in the literature, which suffer from an inflexible framework and inadequate adaptation, the SAIGM model has the following benefits:
• SAIGM's structure is much more flexible, since it automatically and intelligently adjusts the parameters of the conventional GM(1,1).
• SAIGM can define its intrinsic structural model on its own and adjust it to the real properties of the modelling data, regardless of whether the data are random, linear, heterogeneous exponential, homogeneous exponential or a mix of these.
• The addition of a cumulative generation operator eliminates potential disturbances in the series. Additionally, the time response function strengthens the system's evolution law, allows for its extraction, and minimizes errors caused by information loss. Ensuring simulation stability in this way improves predictions.
In light of this, SAIGM reduces the need for modelling knowledge from the standpoint of expert systems.
Overall, the SAIGM model presented in this paper offers high prediction accuracy, adaptability, feasibility, and generalizability, and demonstrates that there is still room for improvement in GM(1,1) optimization.

Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Declaration of Competing Interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability
Data will be made available on request.